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Abstract 

In this paper we study linear projection methods for approximating the solution 
and simultaneously preserving first integrals of autonomous ordinary differential 
equations. We show that (linear) projection methods are a subset of discrete 
gradient methods. In particular, each projection method is equivalent to a class 
of discrete gradient methods (where the choice of discrete gradient is arbitrary) 
and earlier results for discrete gradient methods also apply to projection methods. 
Thus we prove that for the case of preserving one first integral, under certain mild 
conditions, the numerical solution for a projection method exists and is locally 
unique, and preserves the order of accuracy of the underlying method. 

In the case of preserving multiple first integrals the relationship between projec- 
tion methods and discrete gradient methods persists. Moreover, numerical exam- 
ples show that similar existence and order results should also hold for the multiple 
integral case. 

For completeness we show how existing projection methods from the literature 
fit into our general framework. 

1 Introduction 

First, consider an autonomous ordinary differential equation (ODE) with only one first 
integral. We will consider the case of multiple first integrals later. We consider the 
same problem as in [14]: 

± = f(x) t>0, (1) 

where x(t) G M d for some d G N, x(0) = x$ G M. d is the initial condition and / : M. d — >• R rf 
is locally Lipschitz continuous. Existence theory for ODEs (see e.g. [9, Thm. 1.7.3 on 
p. 37]) implies that given a bounded set B C there exists a T > such that for any 
xq G B the solution exists and remains bounded for t G [0, T]. We assume that (1) has 
a conserved first integral I : M. d — > R so that 

x(t) G Mx ■= {zeR d : I(z) = I(x )} for all t G [0,T]. 

To simplify the notation define i := VI and let us also assume that / is a Morse 
function (i.e. smooth with non-degenerate critical points) and that i : R d -> R d is 
locally Lipschitz continuous. As in [12], and discussed in detail in [14], if i(x(t)) ^ 
for t > then we may write (1) as 

x = S(x)i(x) (2) 
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where S : R d — > R dxd is a skew-symmetric (S T = —S) matrix-valued function. In 
general, S is not unique. One choice for S is the so-called default formula, 

q( \ _ f( x )K x ) T - K x )f{x) T 

Since / is a Morse function, the default S is locally bounded on {x G R d : i(x) / 0} 
and for a bounded set B cR d there exists a constant C\ = C\{B) such that 

|/(s)| < CiK(x)| for all xG 5. (4) 

Also define C 2 = C 2 {B) := d + \. 

In general it will be beneficial to approximate the solution to (1) in such a way so 
that / is preserved exactly (in practice up to round off error or a specified tolerance) 
by the approximate solution. Both (linear) projection methods (see e.g. [8, §IV.4 and 
§V.4.1] and references therein) and discrete gradient methods (see e.g. [12, 17, 20]) 
are types of methods that achieve this. In the special case of Hamiltonian systems one 
must choose whether to preserve the Hamiltonian integral or the symplectic structure 
(only the exact solution up to time rescaling preserves both, see e.g. [23]), but there 
are many examples where preserving the Hamiltonian is advantageous, see e.g. [19, 21]. 

First, let us define linear projection methods. The basic idea of a projection method 
is to couple a one-step method with a projection so that after a full time step the 
approximate solution to the ODE lies on the manifold M xo . Let f :R d xR d x [0, oo) — > 
R d define an arbitrary one-step method applied to (1) with time step h, so that 



x' — x 



f(x,x',h) (5) 

where x = x n and x' = x n +i at each time step 1 . Then, one step of a linear projection 
method (c.f. [8, Algorithm IV. 4. 2]) x h-> x' is defined by: Given and h G [0, oo), 

1. compute y such that y = x + hf(x, y, h), 

2. compute x' G M x by projecting y onto M x . 

In this paper we will only concern ourselves with linear projections so that step 2 of 
the above algorithm is given by: 

2. compute x' G M x by solving x' = y + Xi(x,x', h) and I(x') = I(x) for x' G R d 
and Ael, 

where i : R d x R d x [0, oo) -> R d is a vector field that defines the direction of the 
projection and is typically an approximation of i. We refer to this type of projection 
as a linear projection because (x' — y) \\ i(x,x', h). 

Note that for a method defined by y = x + hf(x, y, h) in step 1 of the algorithm 
above, there exists an implicitly defined map $ /t : R d ->• R d such that y = $ h (x). If 
we define g(x,h) := (<3?h(x) — x)/h, then we may alternatively write step 1 as y = 
x + hg(x,h). Using g instead of / in step 1 allows us to easily eliminate y from the 
algorithm and express the algorithm in a single line: Given x G R d and h G [0, oo) 
compute x' G R d and A G R such that 

x' = x + hg(x, h) + \i(x, x', h) and I{x') = I{x). 



x It will be our convention to let x — x n (the approximate solution at step n) and x' = x n +i- 
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For more generality in our projection methods, in addition to allowing different choices 
of i, we will also modify g so that it may also depend on x' . Switching back to using / 
instead of g in the notation we get our general form of a linear projection method for 
preserving a single first integral: Given x <G R d and h € [0, oo) compute x' € R d and 
AGt such that 

x' = x + hf(x,x', h) + Xi(x, x', h) and I(x') = I(x). (6) 

By choosing / and i differently, we obtain different projection methods. To the best of 
our knowledge all of the projection methods that have been described in the literature 
fit into this framework (we are only aware of linear projection methods but it may be 
possible to define projection methods in spaces that are not linear spaces), including 
the (non-symmetric) standard projection method in [8, Algorithm IV. 4. 2] and the sym- 
metric projection method in [8, §V.4.1] and [7]. This will be discussed in more detail 
in Section 4. 

The other type of integral preserving methods we consider are discrete gradient 
methods. For their definition we must first define a discrete gradient of I - a special 
type of discretization of the gradient of /. 

Definition 1. (Gonzalez [5]) A discrete gradient of I, denoted i : l d xR d 4 R d , is 
continuous and satisfies 

i(x, x') ■ (x' — x) = I(x') — I(x) and i(x, x) = i{x) for all x, x' G R d . 

Formulae for constructing discrete gradients include the one used in the average 
(or averaged) vector field method (called mean value discrete gradient in [12], also see 
[19]) and the coordinate increment method [11]. 

If we let i be a discrete gradient of / and S : R d x R d x [0, oo) -> R dxd be a 
skew symmetric continuous and consistent approximation of S then a discrete gradient 
method for solving (1) is defined by the mapping x >->• x' where 

, jx + hS(x,x',h)i(x,x') if i(x) ^ 0, 
\x if i(x) = 0. 

In this paper we only consider the large class of discrete gradients where S is defined 
by the formula 

~ , _ fix, x', h)i(x, x', h) T - i(x, x', h)f(x, x', h) T 
b(x,x,n):- - ~ — , (»J 

i{x, x , h) ■ i(x, x, h) 

where / : R d x R d x [0, oo) — )• R d is a continuous consistent approximation of / and 
i, i and i are all maps from R d x R d x [0, oo) to R d and are continuous consistent 
approximations of i. 

All discrete gradient methods of this type preserve / because 

I(x') — I(x) = i{x, x') ■ (x' — x) = h(i(x, x')) T S(x, x' , h)i(x, x') = 

for all h, x and x' satisfying (7), where i is a discrete gradient of /. The final equality 
follows from the fact that S is skew symmetric. 



3 



In [14] discrete gradient methods of this type were studied and it was shown that 
under certain local Lipschitz conditions and for sufficiently small time step the numer- 
ical solution to (7) (with S defined by (8)) exists and is locally unique, independent 
of the distance to critical points of /. For arbitrary p £ N it was also shown how to 
construct discrete gradient methods that have order of accuracy p. 

In this paper we will show that all linear projection methods of the type (6) are 
equivalent to discrete gradient methods where the approximate solution is independent 
of the particular choice of discrete gradient i. ..We prove this by showing that each 
projection method is equivalent to (generally) several discrete gradient methods, in the 
sense that a projection method and several discrete gradient methods (defined with 
different choices for i) all define the same map x i-> x' for a given h. A consequence of 
this result is that projection methods are a subset of discrete gradient methods. 

In this case when there is only one first integral to preserve, we can then use the 
theory in [14] to obtain by simple corollary new existence, uniqueness and order of 
accuracy results for a large number of linear projection methods (only restricted by 
certain mild local Lipschitz conditions on / and i). 

When there is more than one first integral to preserve, we will prove that the 
same equivalence between discrete gradient and linear projection methods holds, and 
as a consequence projection methods are a subset of discrete gradient methods for the 
multiple integral situation. Since the theory in [14] is only for the single first integral 
case we do not obtain new results about existence, uniqueness and order of accuracy 
from discrete gradient method theory for the multiple first integral case. Proving these 
results for general linear projection methods and discrete gradient methods in the 
multiple integral case remains an open problem. 

The remainder of this paper is organised as follows. In Section 2 we prove our first 
result about the equivalence of linear projection methods and a class of discrete gradient 
methods in the case when (1) has a single first integral. Then, in Section 3 we use this 
result and the theory from [14] to get new results about existence, local uniqueness, 
and order of accuracy for linear projection methods. In Section 4 we demonstrate 
how several projection methods already described in the literature are special cases in 
our framework and how our new results improve on existing results by allowing more 
freedom on the projection direction than previously, and our results are independent 
of the distance to critical points of /. In Section 5 we then consider the case when 
(1) has more than one first integral and our projection and discrete gradient methods 
are designed to preserve multiple first integrals. We write down a new expression for 
linear projection methods in this case involving oblique projection matrices and prove 
equivalence with discrete gradient methods. Using numerical experiments we illustrate 
how the order of accuracy results, proven in the single first integral case, also appear 
to hold in the multiple integral case. Finally, in Section 7 we discuss the implications 
of this work and possible avenues for future research. 

2 Equivalence in the single first integral case 

In this section we explore the relationship between linear projection methods and dis- 
crete gradient methods. We will see that each linear projection method is equivalent 
to possibly several discrete gradient methods where the choice of discrete gradient is 
arbitrary. Note, however, that discrete gradient methods are not always projection 
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methods so that projection methods are a subset of discrete gradient methods. 

So far we have not proven that the projection method defined by (6) is well-defined 
in the sense that the implicit system of equations (6) for x' and A has a unique solution 
for sufficiently small time step h. So let us assume that h is sufficiently small and x' and 
A are uniquely defined by (6) and that i-i ^ (definitions of i and i will be given). In 
the next section we will prove an existence result that justifies these two assumptions 
under sufficient conditions for /, i, i and h. 

The following theorem shows that linear projection methods may be expressed in 
several equivalent ways and the following corollary explains how each linear projection 
method is equivalent to possibly several discrete gradient methods where the choice of 
discrete gradient method is arbitrary. 

Theorem 2. Let i : W 1 x U. d x [0, oo) — > M. d be a consistent approximation of i, let i be 
an arbitrary discrete gradient of I, and let f : R d x M. d x [0, oo) — > M. d be a consistent 
approximation of f . 

Assuming that given x € M. d and h € [0, oo), each of the methods below have uniquely 
defined x' and A, and that i-i ^Q, then they define the same linear projection method. 

x = x + hf(x, x', h) + Xi(x, x', h) such that I (x ) = I (x), (9) 
x' = x + hP(x, x', h)f(x, x', h) where P(x, x' , h) = I - |sgggjMgl (10) 
x = x + hS(x, x', h)i(x, x ) where s(x,x f ,h) 



i(x,x' ,h)-i(x,x f ,h) 

(11) 

Proof. For given iel d and h € [0, oo) suppose x' € M. d and A G K satisfy (9). Take 
the inner product of (9) with i(x, x') to get 

- - - - /•« 
= I{x) — I(x) = i ■ {x' — x) = hf ■ i + \i ■ i, hence A = — h- 



i ■ i 



Substituting this into (9) we find that 



x' = x + hf - hL-li = x + h(f- % J— f ) =x + hPf, 
i-i V i i 



i T i ~ ii T ~\ I f i T - if T 



and 



x' = x + hPf = x + h U=/ - ~=/ = x + h 4— i - 4-i = x + hSi. 

yi 1 1 i 1 1 J \ i i i t I 

Thus, if x' and A satisfy (9) then x' satisfies (10) and (11). To see the converse, note 
that if x' satisfies (10) or (11) then I(x') = I{x) (take the inner product of (10) or (11) 
with i(x, x') and use the definition of a discrete gradient). Define A := — h(f ■ i)/(i ■ i), 
then x' and A satisfy (9). □ 

Note that (9) is the same as (6), our general form for a linear projection method. 
In (10), P is a projection matrix satisfying Pf _L i and (/ — P)f \\ i, i.e. the range of 
P is spanji} -1 - and the null space of P is span{i}. 

Using the equivalence of (9) and (11) we get the following corollary. 
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Corollary 3. Let i : R x R x [0, oo) —> M. be a consistent approximation of i, let 
/ : l 1 * x l d x [0, oo) — >■ M. d be a consistent approximation of f and let i be an arbitrary 
discrete gradient of I. If we define i = i and i = i (or vice versa), then the linear 
projection method defined by (6) is equivalent to the discrete gradient method defined 
by (7) where S is defined by (8). 

Defining i = i and i = i in the definition of a discrete gradient method is a restriction 
so linear projection methods are a subset of all possible discrete gradient methods. 

Also notice that the methods described by (10) and (11) depend on an arbitrarily- 
chosen discrete gradient i, whereas linear projection methods are independent of i. At 
first glance it would appear that the mapping x x' defined by (10) and (11) should 
depend on the choice of i and these methods would give different approximations to 
(1) for different choices of i. It is perhaps surprising that this is not the case, and 
(as a consequence of Theorem 2 since (9) does not depend on i) they give the same 
approximation to (1) regardless of how i is chosen. Thus, each linear projection method 
defines an equivalence class of discrete gradient methods, and is uniquely defined by 
choosing / and the direction of projection given by i. 

3 Existence, uniqueness and order of accuracy 

In this section we will exploit the equivalence between linear projection methods and 
discrete gradient methods by using theory developed for discrete gradient methods to 
prove new results about linear projection methods. 

Typically, the projection step of a projection method (step 2 in our original algo- 
rithm) requires solving an implicit nonlinear system of equations, and a new system 
of equations must be solved at each time step. A basic question regarding projection 
methods is: Does there exist a unique solution to each of these systems of equations? 
A further question is: Does a projection method retain the same order of accuracy as 
the underlying method (the underlying method is step 1 in our original algorithm)? 

Linear projection methods have already been studied in the literature (see e.g. [8, 
§IV.4 and §V.4.1] and [7]) and questions of existence and uniqueness, and order of 
accuracy have already been answered in some cases. However, these results were only 
stated for particular special cases of i (see Section 4) and their proofs rely on either 
a simple geometric argument (which only holds for the standard projection method 
when i(x,x',h) := i(x')), the Implicit Function Theorem, or the Newton-Kantorovich 
Theorem. Closer examination of these techniques - with the assistance of results in [16] 
and [15] that give a lower bound on the radius of existence for the Implicit Function 
Theorem and the Newton-Kantorovich Theorem - reveals that the time step restriction 
on h (or radius of existence) for existence of the numerical solution is h < C\i(x)\ r for 
some positive constants C and r. If x is near to a critical point of / (so that i{x) ~ 0) 
then this type of restriction is undesirable and in numerical simulations it appears to be 
unnecessary. Our new results below are an improvement and extension on these earlier 
results because we avoid this restriction, and we only place mild Lipschitz continuity 
conditions on the projection direction i so that the results hold for a much wider class 
of projection methods. 

For the following results we require the following definition of a ball around a point 
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x G R d . Given x £R d and a constant R > define 

B R [x) :={zeR d :\z-x\< l -^y 

Note that when i(x) = we have Br{x) = {x}. 

To simplify the presentation that follows let us define several 'Assumptions'. 

Assumption 1. Given a bounded set B C R d , we say that f : R d x R d x [0, oo) — s> R d 
satisfies Assumption 1 for positive constants R, L and H if 

f{x,x,0) = f(x) 
\f(u,v,h) - f(w,v,h)\ < L\u - w\ 
\f(u,v,h) - f(u,w,h)\ < L\v - w\ 
\f(x,x,h) - f(x,x,Q)\ < Lh\i(x)\, 

for all u,v,w G Br(x), x € B and h £ [0, H). 

Assumption 2. Given a bounded set B C R d , we say that i : R d x R d x [0, oo) — > R d 
satisfies Assumption 2 for positive constants R, L and H if 

i(x, x, 0) = i(x) 
\i(u,v,h) — i(w,v,h)\ < L\u — w\ 
\i(u, v, h) — i(u, w,h)\ < L\v — w\ 
\i(x, x, h) — i(x, x, 0)| < Lh\i(x)\, 

for all u,v,w £ Br(x), x G B and h G [0, H). 

We say that i : R d x R d —> R d satisfies Assumption 2 for R and L if i : R d x R d x 
[0, oo) — > R d defined by i(x,x',h) = i(x,x') for all x,x' G R d and h G [0, oo) satisfies 
Assumption 2 for R, L and any positive constant H. 

Assumption 3. Given a bounded set B C R d , we say that g : R d — s> R d satisfies 
Assumption 3 for positive constants R and L if 

\g(u) - g{v)\ < L\u - v\ 

for all u,v G Br(x) and all x G B. 

Note that since we have assumed that / is locally Lipschitz continuous, for any 
R > there exists a corresponding L > such that / satisfies Assumption 3. Similarly 
for i. 

The following theorem ensures for sufficiently small h and under certain local Lip- 
schitz continuity conditions, that linear projection methods (defined by (6)) have a 
numerical solution that is locally unique. Its proof is omitted because it is a direct 
consequence of Theorem 2.1 in [14] and Corollary 3 where i is chosen to be an arbi- 
trary discrete gradient of / satisfying Assumption 2 for R, L and H defined as in the 
theorem below. 

Theorem 4. Let B be a bounded set in R d , let C2 be the constant defined by (4), and 
suppose that R, L and H are positive constants such that 
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1. f:R d xR d x [0, oo) — y R d satisfies Assumption 1 for R, L and H, and 

2. i : R d x R d x [0, oo) -> R d satisfies Assumption 2 for R, L and H. 
Define 

Rl := max{i?, 10L} and iT := min |g, ^ p^p^ } ■ 

T/ien /or eac/i x £ B and h G [0, ii 7 ), i/iere exists a unique x' G Bri{x) and A G R swc/i 
i/iai |A| < T^C^/t satisfying (6). 

This existence result only provides us with local uniqueness since we are only sure 
that x' is unique in the ball Br/(x) C R d . 

Now let us consider the order of accuracy of linear projection methods. We use the 
following definition for order of accuracy, which is similar to [4, Def. V.1.3]. 

Definition 5. A one-step method x i— >■ x' with time step h for solving (1) has order of 
accuracy pGN, if for problems with sufficiently smooth f there exist positive constants 
C and H such that 

\x' -x{t + h)\< Ch p+1 for all h G [0, H] and all x G B, 

where x(-) denotes the solution to (1) with x(t) = x for some t > and B is a compact 
set in R d . The constants C and H may depend on B but should be independent of x 
and h. 

If we are given an underlying method that is of order p for some p G N, i.e. the 
method x i-> y defined by y = x + hf(x, y, h) is of order p, then an important question 
to ask is: What additional conditions (in addition to Assumption 2) on i (recall i 
defines the direction of the projection) are required to ensure that a linear projection 
method defined by (6) is also of order pi The following theorem gives the answer: 
none! Besides Assumption 2, there are no additional conditions on i that are required 
to ensure a linear projection method is of order p. 

Again, we rely on theory in [14] to achieve our result. The following theorem is a 
special case of Theorem 3.3 in [14] using Corollary 3 and an arbitrary discrete gradient 
i satisfying Assumption 2. 

Theorem 6. For a compact set B C R d , let C2, R, L, H, f, i, R' and H' be defined 
as in Theorem 4 and let f satisfy Assumption 3 for 5R' and L. For each x G B and 
h G [0, H') 

1. let x' G Bri{x) and A G R be the unique solution to (6) such that [A| < ^C^h 
(which exists by Theorem 4), 

2. let y G B§r> (x) be the unique solution to y = x + hf(x, y, h) (which exists by [14, 
Lem. 3.1]), and 

3. let x(-) denote the exact solution to (1) satisfying x(t) = x for some t > 0. 
Also suppose that 
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4- f is such that the method x *— > y defined by y = x + hf(x, y, h) is of order p for 
some p€N, i.e. when f is sufficiently smooth there exist positive constants C3 
and H3 < H' such that 

\y - x(t + h)\ < C 3 h p+1 for all h £[0,H 3 ] and all x G B. (12) 

Then the linear projection method defined by (6) is also of order p, so that when f is 
sufficiently smooth there exist positive constants C5 and such that 

\x' -x(t + h)\< C 5 h p+1 for all h G [0, H 5 ] and all x G B. 

4 Existing linear projection methods 

In this section we consider several linear projection methods that have been described 
and studied previously in the literature. Our purpose is to show how all of these 
methods are special cases in our general framework, and hence our new theory also 
applies in these cases. 

We will need the following version of Banach's Fixed Point Theorem (also known 
as the Contraction Principle). This version was also used in [14] and is from [10, Thm. 
3.1.2 on p. 74]. 

Theorem 7 (Banach's Fixed Point Theorem). Let (X,d) be a non-empty complete 
metric space. Let T : X — > X be a contraction on X , i.e. there exists a q G (0, 1) such 
that 

d(T(x),T(y)) < qd(x, y) for all x, y £ X. 

Then there exists a unique fixed point x* G X such that T(x*) = x* . Furthermore, the 
fixed point can be found by iteration, x n+1 = T(x n ) for n = 0, 1,2, . . . with x° G X 
arbitrary. 

4.1 Example 1: (non-symmetric) standard projection method 

In our notation, the (non-symmetric) standard projection method described in [8, Al- 
gorithm IV 4.2] for x I—?- x' is defined by 

y = x + hg(x,y,h) 

x' = y + Xi(x') such that L(x') = I(x), 

where g : R d x R d x [0, 00) — s- R d defines a map x i-> y that is an arbitrary one-step 
method applied to (1). Let be the implicitly defined map so that y = ^(x). If we 
define i:R d xR d x [0, 00) R d and f :R d xR d x [0, 00) R d such that 

i(x, x' , h) := i(x') and f(x,x',h):=g(x,<&h(%),h), (13) 

for all x, x' G R d and h G [0, 00) then the standard projection method is a linear 
projection method of the form (6). 

However, for computation the authors of [8] suggest using (6) with i defined by 

i(x,x?,h):=i(y) = i(§ h (x)) (14) 
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for all x, x' G M. d and h G [0, oo), instead of (13) to make the system of equations easier 
to solve at each time step. Strictly speaking, this method with i given by (14) instead 
of (13) is a different projection method because the projection direction is different. 
Let us refer to it as version 2 of the (non-symmetric) standard projection method. 

To apply our new theory in Theorems 4 and 6 we must determine what conditions 
g and i must satisfy to ensure that / and i satisfy Assumptions 1 and 2, respectively, 
for both versions of the standard projection method. First, consider the first version of 
the standard projection method when / and i are defined by (13). We must first prove 
the following lemma about existence, uniqueness and continuity of &h- 

Lemma 8. For a bounded set B C M. d , let C\ be the constant from (4) and suppose 
that g satisfies Assumption 1 for some positive constants R g , L g and H g . 

If u G B2R g {x) and h < mm{H g , gj-, 4 ^ Cl+ \/^ R }, then there exists a unique 
y = &h(u) G Br 3 (x) satisfying y = u + hg(u,y, h). Moreover, 

|*h(«)-«|< (Ci + l + ^)h\i(x)\, (15) 

and if v G BiR g (x) then 

\$h(u)-Q h (v)\<l\u-v\. (16) 

Proof Fix x G B, h < mm{H g , 4(Ci+i/6)fl g } and G B 2R 3 (x). We will apply 
Theorem 7 with X := Br 9 (x) and T(z) := u + hg(u, z, h) for all z G X. To do so we 
must show that T(z) G X for any z£l and that T is a contraction on X. It is obvious 
that X with the metric | • | (the usual Euclidean distance) is a non-empty complete 
metric space. Let z G X. Then using Assumption 1 for g, (4), u G B2R g {x) C Bn g (x), 
z G Bji g (x) and the bound on h we get 

\T(z) — x\ < \u — x\ + h\g(u, z, h)\ 

< \u — x\ + h (|/(x)| + L g \u — x\ + L g \z — x\ + L g h\i(x)\) 

<^ + h(c 1 + ^ + ^ + l)\i(x)\ W 
= ^ + HCi + \) \i(x)\ + h^\i(x)\<^. 
Hence T(z) G X. For z, z' G X, using Assumption 1 for g and h < we also get 

\T(z) -T(z')\ = h\g(u,z,h) -g(u,z',h)\ < hL g \z - z'\ < \\z - z'\ 

so that T is a contraction on X. Therefore, by Theorem 7 there exists a unique 
y G Bji g (x) satisfying y = u + hg(u,y,h). Define &h such that $/i(u) = y. 

To get (15) we use a similar argument to (17). Finally, (16) follows from the fol- 
lowing inequality where we have again used Assumption 1 for g and h < m.in{H g , qj^} 

\^ h (u) - ® h {v)\ < \u - v\ + h\g(u, <5> h (u),h) - g(v, § h (v), h)\ 
<\u-v\ + h{L g \u -v\ + L g \^ h {u) - $ h (v)\) 
< l\u-v\ + $\$ h (u)-$ h (v)\. 

Kence \$h(u) - $h(v)\ < l\u - v\. □ 
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Now we can prove that / denned by (13) satisfies Assumption 1 for some choice of 
R, L and H. 

Lemma 9. For a bounded set B C R d , let C\ be the constant from (4) and suppose that 
g satisfies Assumption 1 for some positive constants R g , L g and H g . Define R := 2R g , 

L:=max{fL fl , (Ci + ^ + SOM ° nd H := min { H 9' 4(c 1+ \/6)R g } • 

Then f : R d x R d x [0, oo) -> R d defined by f(x, x', h) := g(x, h) for all x, x' £ R d 

and h £ [0, oo) satisfies Assumption 1 for R, L and H . 

Proof. Fix x £ B, u,v,w G Br(x) and h € [0,H). Using Assumption 1 for g and 
&o(x) = x we get 

f(x, x, 0) = g(x, $ (x),0) = g(x, x, 0) = f(x). 
Using Assumption 1 for g and Lemma 8 (in particular (16)) we get 

\f(u,v,h) - f(w,v,h)\ = \g(u,$ h (u),h) - g{w,<5> h {w),h)\ 

< L g \u -w\ + L g \<Z> h (u) - <Z> h (w)\ 
<L g {\ + l)\u-w\ 

< L\u — w\. 

Trivially, \f(u,v,h) — f(u,w,h)\ = 0. Finally, again using Assumption 1 for g and 
Lemma 8 (in particular (15)) we get 

\f(x,x,h) - f(x,x,0)\ = \g(x,$ h (x),h) -g(x,x,0)\ 

< L g \$ h (x) - x\ +L g h\i(x)\ 

< (Lg{C 1 + \ + ^) + Lg)h\i{x)\ 

< Lh\i(x)\. 

□ 

Now let us consider how we should choose R, L and H so that i defined by (13) or 
(14) should satisfy Assumption 2. 

Lemma 10. For a bounded set B C R d , let R and L be positive constants such that i 
satisfies Assumption 3 for R and L, and let H be an arbitrary positive constant. Then 
i:R d xR d x [0, oo) -> R d defined by i(x, x', h) = i(x') for all x, x' € R d and h € [0, oo) 
satisfies Assumption 2 for R, L and H . 

Proof. Since i is locally Lipschitz, given arbitrary R > 0, L exists. The rest of the 
proof is trivial. □ 

Lemma 11. For a bounded set B C R d , let C\ be the constant from (4), and suppose 
that 

1. g satisfies Assumption 1 for some positive constants R g , L g and H g , and 

2. i satisfies Assumption 3 for R g and Li (given R g , Li exists since i is locally 
Lipschitz). 
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Define R:=2R g , 

L : = max |Lj, (Ci + \ + max{L g ,Lj}j and 

# := mm 4^+176)^"} ■ 

Then i:R d xR d x [0, 00) -> R d defined by i(x, x', h) := /or a// x, x' G R d and 

h G [0, 00) satisfies Assumption 2 for R, L and H . 

Proof. Let x G B, u,v,w G Br{x) and /i G [0,-ff). Since 3>o(x) = x it follows that 
i(x,x,0) = i($o(x)) = i(x). Using Assumption 3 for z and Lemma 8 (in particular 
(16)) we get 

\i(u, v, h) — i(w, v,h)\ < Ld$/j(u) — &h(w)\ < \Li\u — w\ < L\u — w\. 

Trivially, we have \i(u, v, h) — i(u, w, h)\ = 0, and using x = &o(x), Assumption 3 for i 
and Lemma 8 (in particular (15)) we get 

\i(x,x,h) -i(x,x,0)\ < Li\§ h (x)-x\ <L i (c 1 + \ + g^) h\i(x)\ < Lh\i{x)\. 

□ 

4.2 Example 2: symmetric projection method 

It is perhaps surprising that the symmetric projection method from [8, §V.4.1] (orig- 
inally in [7]) may also be written in the form of (6). In our notation, the symmetric 
projection method described in [8, §V 4.1] for x ^ x' is defined by: Given i£R d and 
h G [0, 00), compute y, z, x' G R d and /i£M such that 

y = x + fii(x), 

z = y + hg(y,z,h), (18) 
x' = z + fii(x') and I(x') = I(x), 

where g : R d x R d x [0, 00) — > R d is such that y i-> z defined by z = y + hg(y, z, h) is 
any symmetric one-step method applied to (1). If we let A = 2/u, and eliminate y and z 
from (18) then we get: Given x G R d and h G [0, 00) compute x' G R d and A G M such 
that 

x' = x + hg(x+ ±i(x),x' - ±i{x'),h) + and I(x') = I{x). 

If we let ^ be the implicitly defined mapping so that A = \l/(x, x', h) where A satisfies 

g (x + ^i(x),x' — ^i(x'), h) ■ i(x, x') 



A = -h- 



i(x)+i(x') -/ i\ 
y ' 2 y ' ■ 1{X, x'j 



where i is an arbitrarily chosen discrete gradient of /, then we see that the symmetric 
projection method is equivalent to (6) if we define i : R d x R d x [0, 00) — > R d and 
/ : R d x R d x [0, 00) R d by 

~, , , % i(x) + i(x') 
i(x,x,h):= , 

2 (19) 

f(x, x', h) :=~g(x + ^f^(x), x' - ^f^i(x'), hj 
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for all x, x' G R d and h G [0, oo). 

It turns out that the fact that g satisfies Assumption 1 is sufficient to ensure that 
/ and i satisfy Assumptions 1 and 2 respectively. This will ensure that we are able to 
apply Theorems 4 and 6 to the symmetric projection method. However, verifying that 
this is true is quite technical, so we have included it only as an appendix. 

4.3 Example 3: Methods of Dahlby, Owren and Yaguchi 

In [2] Dahlby et al. describe two projection methods. In our notation, given an arbitrary 
discrete gradient i of /, then the first of their methods (see [2, eq. 2.1]) is defined by 

y = $h(x), x' = x + P{y-x) (20) 

where <&h : R d — > R d defines an arbitrary one-step method for solving (1), and P is a 
projection onto span{i(x, x')} -1 -. In [2, §2.2] the projection matrix P is defined as 



i(x, x')i(x, x') T 



i(x, x') T i(x, x') ' 



(21) 



which is the orthogonal projection matrix onto span{i(x, x')} -1 -. If we define / : R d x 
R d x [0, oo) ->• R d and i : R d x R d x [0, oo) R d by 

f(x,x',h) := * h W~ x and i(x, x' , h) := i(x, x') (22) 

for all x, x' G R d and h G [0, oo) then it is easy to see that (20) is the same method as 
(10), so it is a special case of our general linear projection method. 

The second of the projection methods by Dahlby et al. (see [2, eq. 2.2]) is, in our 
notation and given an arbitrary discrete gradient i of I, defined by 

x' = x + hPg(x,x',h) (23) 

where P is the same projection matrix as above and g : R d x R d x [0, oo) — > R d is such 
that the map x4i/ defined by y = x + hg(x, y, h) is an arbitrary one-step method for 
solving (1). By defining / and i such that 

f(x,x',h)=g(x,x',h) and i(x, x' , h) = i(x, x) (24) 

for all x, x' G R d and h G [0, oo) we see that (23) is the same as (10), so it is another 
special case of our general linear projection method. 

Dahlby et al. call their methods 'discrete gradient methods' because the methods 
are constructed using a discrete gradient. We think it is more appropriate to describe 
these methods as projection methods. However, our theory (Theorem 2) has established 
that they may also be expressed in the form of (7) , for which we use the term 'discrete 
gradient method'. 

It is a relatively simple task to show that if for a given bounded set B C R d , g 
satisfies Assumption 1 for some positive constants R g , L g and H g , and i is a discrete 
gradient of / satisfying Assumption 2 for some positive constants Rj and Lj, then / and 
i defined by (24) satisfy Assumptions 1 and 2, respectively, for some positive constants 
R, L and H. For this reason and for the sake of brevity we omit the details. In the 
case when / and i are defined by (22) we must make suitable assumptions about the 
method defined by <&h for a similar result to hold. 
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5 Equivalence in the multiple first integrals case 



Now let us consider the case when (1) has multiple preserved integrals. Suppose that 
(1) preserves M first integrals h,..., Im, i-e. there exist I m : M. d — > R for m = 1, . . . , M 
such that for all t > 0, 

x(t) € M xo := {2£K d : J m (z) = I m (x ) for all m = 1, . . . , M}. 

For each m we use the notation, i m := VI m . Recall from Section 1 the algorithm for 
computing one step x >->■ x' of a projection method: Given and /t G [0, oo) 

1. compute y such that y = x + /t/(x, y, /i), 

2. compute x' G TW^ by projecting y onto -M x . 

For the general linear projection case when we have multiple integrals to preserve we 
first define M directions for the projection, i.e. i m : R d x R d x [0, oo) -> M d for 
m = 1,... ,M, and then replace step 2 with 

2. compute x' G R d by solving x' = y + AX such that x' G .M^ for x' G M rf and 
A G M M , where A = [i r ■ -i M ] G M <ixM and i m = i m (x, x', /»). 

We say that this type of projection is a linear projection because x 1 — y is a linear 
combination of the projection directions i m , i.e. (x' — y) G span{?i, . . . , im}- As in 
Section 1 for the single integral case, we can write the two step general linear projection 
method algorithm in one line by eliminating y and generalising /. The method is: Given 
x eM d and h G [0, oo) compute x' G R d and A G 1R M such that 

x' = x + hf(x, x', h) + AX and I m (x') = I m {x) for all m. (25) 
5.1 Oblique Projections 

Before we present our theorem showing the equivalence between linear projection meth- 
ods and discrete gradient methods for ODEs with multiple integrals we need to intro- 
duce oblique projection matrices. 

The type of projection described in detail in linear algebra textbooks is usually 
orthogonal projection, e.g. [22, Lecture 6] and [13, §5.13]. For a set of linearly indepen- 
dent vectors a\, . . . , cim G R d (not necessarily orthogonal), the orthogonal projection 
matrix that maps R d onto the subspace A := spanjai, . . . , a^} along the subspace A 1 - 
is given by 

Q = A(A T A)~ 1 A T , where A = [a v ■ ■ a M ] G R dxM . 

Since x = Qx + (I — Q)x is the unique decomposition of x G R d into vectors in A and 
A 1 - (see e.g. [22, p. 43]), it follows that 

q ± =I-Q = I- A(A T A)~ 1 A T 

is the projection matrix onto A 1 - along A. 

We would like to consider the generalisation of these two projections where the 
range of the projection is decoupled from the direction of the projection. For this we 
define an oblique projection (see e.g. [13]). Note that the space along which a projection 
projects is the null space of the projection and we may define an oblique projection by 



14 



specifying its range and null space. For a projection R let a±, . . . , a« be a basis for 
the range of R, and let 61, ... , 6m be a basis for the orthogonal complement of the null 
space of R, so that 

range(-R) = A, 
null(.R) = H ± , 

where A := spanjoi, . . . , om} and B := span{&i, . . . , 6m}- Then the oblique projection 
matrix R is given by the formula [13, eq. (7.10.39) on p. 634], 

R = A{B T A)- 1 B T 

where A = [a v ■ ■ a M ] G K dxM and B = [b v ■ ■ b M ] G M dxM . To ensure that 5 T A is 
invertible and R exists we also need that A and B 1 - are complementary subspaces of 
R d (i.e. A + B 1 - = R d and A n tf- 1 = {0}, see e.g. [13, p. 383]). More generally, in 
the terminology of [13], if M = AB T is a full rank factorisation of M G M dxM and if 
range(M) and null(M) are complimentary subspaces, then R = A(B T A) -1 B T is the 
projection onto range(M) = range(-A) along null(M) = mill(B T ). Note that A and B T 
have full rank if the columns of A are linearly independent, and the rows of B are 
linearly independent (see e.g. [13, p. 218]). 

The following proposition will help us decide whether or not A and B 1 - are comple- 
mentary subspaces. 

Proposition 12. With cti, . . . , aM, b±, ... , 6m , A, B, A and B defined as above, then 
A and B 1 - are complementary subspaces if and only if B T A is invertible. 

Proof. From [13, p. 383] we have that A and B 1 - are complimentary subspaces if and 
only if for any x € M. d there exists a unique decomposition x = xj± + x B i where xj± € A 
and x B ± € B L . 

Assume that A and B 1 - are complementary subspaces. Then there exists a unique 
decomposition x = x^ + Xgx and since a±, . . . , om is a basis for A there exists a unique 
v € M M such that x^ = Av. Therefore, B T x = B T x^ = B T Av and since v is uniquely 
determined given x, B T A is invertible. 

Conversely, suppose B T A is invertible. Then the matrix R = A{B T A)~ 1 B T is 
well defined and for a given x € M d , := Rx € .4 and Xg± := (I — R)x £ i? -1 - 
defines a decomposition x = x^ + x B x . To complete the proof we must show that this 
decomposition is unique. Suppose x = + y B ± where y_\^ A and y B ±_ G B -1 defines 
another decomposition of x. There exists a unique tt> G M M such that y_4 = Aw. Then 
i? T x = B T yj{ = B T Aw and hence w = (B T A)~ 1 B T x. Substituting this into = Aw 
we get y_4 = i?x = xj± and y e ± = Xg± and the decomposition is unique. □ 

An obvious choice for B so that B T A is invertible is B = A. Since A has full rank, 
A A is positive definite and invertible. But this corresponds to orthogonal projection. 
More generally, if B is sufficiently "close" to A then B T A is positive definite and hence 
invertible. If for any m, b m G A 1 - then B T A is not invertible since it has a column with 
all zeros. 

Another projection matrix R± with 

range (R ± ) = B L 
null(i?_L) = A 
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may be defined by 

R ± :=I-R = I- A(B T A)- 1 B T . 

We will use R± to define an alternative formulation for linear projection methods for 
ODEs with multiple first integrals. 

5.2 Equivalent formulation using an oblique projection matrix 

A general method for solving (1), 

x' = x + hf(x,x',h) (26) 

is integral preserving if and only if / € spanjii, . . . , i^} 1 where each i m is a discrete 
gradient of I m . This fact follows from the definition of a discrete gradient: For each 
m, if i m is a discrete gradient of I m , then / _L i m if and only if 

I m {x) - I m (x) = {x - x) ■ i m = hf ■ i m = 0. (27) 

However, in general, we do not have / G spanjii, . . . , im} ± - Therefore, a way of 
constructing an integral preserving matrix is to modify (26) to 

x 1 = x + hPf(x, x 1 , h) 

where P = P(x, x', h) is chosen to be a projection matrix with range(P) = spanjii, . . . , im} 
so that Pf G spanjii, . . . ,im} ± - It turns out that constructing an integral preserving 
method in this way is equivalent to a general linear projection method of the form 
(25). This equivalence is formalised in the following theorem and is an extension to 
our earlier Theorem 2 (in particular the equivalence between (9) and (10)). 

Theorem 13. Let f : R d x R d x [0, oo) ->• R d be a consistent approximation of f and 
for each m = 1, . . . , M let i m : R d x M. d x [0, oo) — > M. d be a consistent approximation 
of i m and let i rn be an arbitrary discrete gradient of I m . Define 

P := I - A(B T A)- 1 B T (28) 

where 

A:=[i v --i M ]eR dxM and B := [i v • • i M ] € R dxM , 
and i m = i m (x , x' , h) and i rn = i m (x,x'). Assume that 

1. each of the two methods below have uniquely defined x' G R d and A G R M for 
sufficiently small h, 

2. {h, ... ,im} an d {h, ■ ■ ■ ,1m} are linearly independent sets (so that A and B T 
have full rank), and 

3. S and S ± are complementary subspaces ofR d , where S = spanjii, . . . ,?m} and 
<S = span{n,... ~i M } (i.e. S + S ± = R d and S n S ± = {0}J. 

Then the following expressions describe the same linear projection method. 

x' = x + hf(x, x', h) + A\ such that I m {x') = I m (x) for all m = 1, . . . , M, (29) 

and 

x' = x + hPf(x,x',h). (30) 
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Proof. Conditions 2 and 3 in the theorem ensure that P exists (B A is invertible), 
range(P) = S and null(P) = S (see discussion in previous section about oblique 
projection matrices). 

For given x € M. d and h sufficiently small, suppose x' <G M. d and A € R M satisfy (29). 
For each m, since i m is a discrete gradient, 

= I m (x') - I m (x) = i m (x,x') T (x' - x) = hi T m f + i T m A\. 

Therefore, 

= hB T f + B T A\ A = -h(B T A)- 1 B T f. 

Substituting this into (29) we get 

x > = x + hf- hA(B T A)~ 1 B T f = x + hPf, 

so x' satisfies (30). 

Conversely, for given x G M. d and h sufficiently small, suppose x' € M. d satisfies 
(30). We know that range(P) = S^. Therefore, Pf G S 1 - and by (27) we have 
I m {x') = I m (x) for all m = 1, . . . , M. Finally, if we define A = -h(B T A)- l B T f then 

x' = x + hPf = x + hf- hA(B T A)' 1 B T f = x + hf + AX 

and x' and A satisfy (29) as required. □ 

Since i m and i m are consistent approximations of i m it follows that if i m (x) for 
m = 1, . . . , M are linearly independent then for sufficiently small h, both . . . , {m} 
and . . . ,im} are linearly independent sets. Moreover, for small h the matrix B is 
"close" to A and the property that S and S 1 - are complementary subspaces of M. d is 
satisfied (see discussion after Proposition 12). 

Now that we have established an alternative formulation for general linear projec- 
tion methods we can explore their relationship to discrete gradient methods. 



5.3 Equivalence with discrete gradient methods 

In this section let us consider discrete gradient methods for preserving more than one 
integral. Our aim is to construct a general discrete gradient method to approximate the 
solution to (1) such that M integrals are simultaneously preserved, and then determine 
which discrete gradient methods are equivalent to linear projection methods. 

According to [12, Prop. 2.14] (see also [18] for the two integral case), we may write 
(1) as 

%3 = Sjji32 -3m *n*j2 ' ' ' f° r eaC ^ 3 = ' ' ' ' ^' (31) 

using Einstein's summation principle for repeated indices, where i m := V/ m for each 
m and 

where D : R d -»■ R dxM is defined by D = [t 1 - • • i M ]. See e.g. [3, Chap. 1] or [6] for a 
definition of the anti-symmetric A product from exterior algebra. 

Based on the expression for the ODE given in (31) we can write down a general 
discrete gradient method for solving (1). Let S = S(x,x',h) be an anti-symmetric 
consistent approximation of S and for each m let i m = i m (x,x') be a discrete gradient 
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of I m . Then, the method x x' is defined as: Given x G M d and h G [0, oo), x' G M d 
satisfies 

— X ' . - 

J fe J = S 3hh -3m\ " " " *2r fOT each J = 1, ■ ■ ■ , d- (32) 

For particular choices of S, this discrete gradient method is equivalent to a linear 
projection method. To prove our result we will need the following proposition (see [6, 
eq. 5.10 on p. 106] where the pair of dual spaces are M. d and itself with the usual 
Euclidean inner product). 

Proposition 14. For arbitrary M,d G N, let U, V G R dxM be two matrices with 
columns u 1 , . . . , u M G M. d and v 1 , . . . , v M G M d respectively. Then 

(n 1 A u 2 A- • • A u M ) nn ... jM v) x -- ■ vf M = det(V T U). 

Theorem 15. Let f = f(x,x',h) be a consistent approximation of f(x), and for each 
m = 1,...,M let i m = i m (x,x',h) be a consistent approximation of i m (x) and let 
jm _ fr e a discrete gradient of I m (x). Define 

A= [i 1 - ■ ■ i M ] G M. dxM and B = [?••• i M ] G R dxM . 

Assume that 

1. the discrete gradient method and the linear projection method defined below have 
uniquely defined x' G R d and X G M M for sufficiently small h, 

2. {i 1 , . . . , i M } and {i 1 , . . . , i M } are linearly independent sets (so that A and B have 
full rank), and 

3. S and S ± are complementary subspaces of M. d where S = spanji 1 , . . . , i M } and 
5 = span{i 1 ,...,i M }. 

If we define 

S =dS(W A?A "' Ai " (33) 

then the discrete gradient method defined by (32) and (33) is equivalent to the linear 
projection method defined by (29) or (30). 

Proof. To show that these methods are the same we must show that 

for each j = 1, . . . , M, where P = I — A(B T A)~ l B T . Equivalently, we can show that 

(Pf) ■ v = Sjj 1 j 2 ...j M Vjij 1 ij 2 - ■ ■ ij M , 

for any v G M. d . Let v be an arbitrary vector in Using Proposition 14 and expanding 
the determinant along the first row we get 



(/ A i 1 A - • • A ^ M h nn - m v j ^ 1 z%---zjl = det {[vi 1 - ■ ■ i M ] T [fi 

= det 



v-f 


~1 ~M 1 
V ■ I ■ ■ ■ V ■ I 


B T f 


B T A \ 




M 



(v ■ f) dat{B T A) + ^2(-l) j (v ■ i j ) det(B T A j ), 
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where Aj = [f i 1 - ■ ■ p- r P +l - ■ ■ ~i M ] £ 



ndxM 



Using the fact that the determinant of a 



matrix is anti-symmetric (each column swap introduces a factor of —1) it follows that 



det(B T Aj) = (-iy det(B T A j 



where A a 



l...~ij-lfij+l...~iM 



IxM 



i.e. the matrix A with P replaced by /. 



Hence, using the two identities above and Cramer's Rule (see e.g. [13, p. 476]) we get 

M 



71 72 



-■M 



det( gr A) ( (« • /) det(S r A) - J2(v ■ i j ) det(B T A 3 



M 



det{B T A j )~. j 



% det(BTA) 

M 

f-J2[(B T A)-'B T f 
(f-A(B T Ay l B T f) -v = (Pf) 



v. 



□ 



If we restrict ourselves to the situation where only two integrals are preserved, I 
and J, with i := VI and j := VJ, then (1) may be written as (see [18]) 



^ Simnimjn for each l = l,...,d, 



(34) 



m,n=l 



where Si mn is an anti-symmetric tensor given by 



Slmn — 



fi k ji 

fm im jn, 
fn in jn 



and 



N=(i-i)(j-j)-(i-jf. 



The general discrete gradient methods for (34) that will preserve both I and J are 

x\ - Xl 



h 



— 1 Si mn i m j n 

m,n=l 



(35) 



where i and j are discrete gradients of / and J respectively, and S is a skew-symmetric 
consistent approximation of S. If we define 



1 

N 



fl H 3l 

fm ^m jm 
fn "in jn 



and 



N=(i-i){j-~j)-(i-~j)(l-~j) 



where / is a consistent approximation of /, % and j are consistent approximations of i 
and j respectively, then this discrete gradient method is a linear projection method. 

We remark that (29) and (30) do not depend on any discrete gradients of I m . 
Therefore, we may conclude from Theorem 15 that each projection method (defined 
by the choice of projection directions) is equivalent to a class of discrete gradient 
methods where the approximate solution values at each time step are independent of 
the particular choices of discrete gradients used in the discrete gradient methods. 
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5.4 Existence, uniqueness and order of accuracy 

For the single preserved integral case we could use theory from discrete gradient meth- 
ods to prove, under certain local Lipschitz continuity and consistency conditions, the 
existence of a unique solution x' at each time step of a projection method for sufficiently 
small h. We were also able to show under the same conditions that a projection method 
retained the same order of accuracy as the underlying method. For the multiple inte- 
gral case we cannot do this because these results for discrete gradient methods are not 
yet available. We do not anticipate that extending these results to the multiple integral 
case poses any real difficulty, except that a proof may be very lengthy. In Section 6 
we will try to test numerically whether or not it is correct to assume that these results 
hold in the multiple integral case. 

5.5 Special cases of projection methods 

As in Section 4 we can show how our expression for a general linear projection method 
(25) for preserving multiple first integrals encompasses all existing (as far as we are 
aware) projection methods. Unlike Section 4 however, we will not go into all of the 
technicalities regarding existence of a unique solution for sufficiently small time step. 
Once again, the trick to seeing how other projection methods fit into our framework is 
to make the right choice for the projection directions i m . 

Example 1 revisited: (non-symmetric) standard projection method. 

The first method we consider is again the (non-symmetric) standard projection 
method described in [8, Algorithm IV 4.2]. In our notation, their method in the 
multiple preserved integral case for x >->■ x' is defined by solving the following system 
of equations for x' G R d and A G R M , given x G R d and he [0, oo), 

y = x + hg(x,y,h), 

x' = y + A\ and I m (x') = I m (x) for every m = 1, . . . , M, 

where the map x i-> y defined by y = x+hg(x, y, h) defines an arbitrary one-step method 
applied to (1) and A = [h(x')- ■ ■ im(x')] G R dxM . If we let ®h be the implicitly defined 
map so that y = &h(x) then this method has the form of (25) if for each m we define 
i m : R d x R d x [0, oo) R d and f :R d xR d x [0, oo) R d such that 

i m (x, x', h) := i m (x') and f(x, x' , h) := g(x, $ h (x),h) 

for all x, x' G R d and h G [0, oo). In [8], the authors suggest instead using i m (x, x' , /i) : — 
im(y) to reduce the number of evaluations of i m {-) required when solving the system 
of equations at each step using a simplified Newton method. 
Example 2 revisited: symmetric projection method. 

The multiple first integral version of the symmetric projection method (see [8, 
§V.4.1] or [7]), in our notation for x x', is: Given x G R d and h G [0, oo), compute 
y, z, x' G R d and fi G R M such that 

y = x + A"n, 

z = y + hg{y,z,h), 

x' = z + A'fi and I m (x') = I m (x) for each m = 1, . . . , M, 
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where y i— >■ z defined by z = y + hg(y, z,h) is a symmetric one-step method applied 
to (1), A" := [h(x)- ■ ■ i M (x)] G R dxM and A' := [n(x')- • • i M {x')} G M dxM . If we let 
A = 2/j, and A = \{A" + A'), and eliminate y and 2, then we can write the method in 
one line: Given x G R d and h G [0, oo), compute x' G R d and A G IR M such that 

x = x + hg (x + 7}A"\, x — \A'\, h) + AX and I m (x') = I m {x) for each m. 

Let \E' be the implicitly defined map so that \l/(x, x', h) = A where A satisfies 

A = -h{B T A)~ 1 B T ~g (x + \A"\ x' - \A'\ h) 

where B = [i r ■ ~i M ] G !R dxM for some arbitrarily chosen discrete gradients i m — 
i m (x,x') of I m . It is now clear that this method may be written in the form (25) if for 
each to we define i m : R d x R d x [0, oo) -)■ R d and / : R d x M d x [0, oo) M d such that 

i m (x, x', /i) := \{i m {x) + i m (x')) and /(x, x', /i) := z, h) 

for all x,x' G R d and /i G [0, oo) where y := x + ^"^(x, x', /i) and z := x' — 
\A'^{x,x',h). 

Example 3 revisited: Methods of Dahlby, Owren and Yaguchi. 

The multiple integral preserving form of these methods is obtained by replacing the 
projection matrix P defined earlier in (21) with the orthogonal projection matrix 

P = I - B(B T By 1 B T 

where B = - ■ ■ im] and each i m = i m (x, x') is an arbitrarily chosen discrete gradient 
of I m . With this new P the first of the Dahlby et al. methods is (cf. (20)) 

y = $ h {x), x' = x + P(y-x), 

where <&h defines an arbitrary one-step method for solving (1). This method has the 
form of a general linear projection method (25) if for each to we define i m : R d x R d x 
[0, oo) -> R d and / : R d x R d x [0, oo) -> R d such that 

i m (x, x', h) := i m (x, x') and f(x, x' , h) := — — - 

for all x, x' G Mr and h G [0, oo) where each i m is an arbitrary discrete gradient of I m . 
The second Dahlby et al. method uses the same choice of i m , but a different / (defined 
earlier in (24)). 

6 Numerical Examples 

In this section we use a numerical example to provide evidence that the same results 
for preserving a single first integral, also hold for methods that preserve multiple first 
integrals. In particular we will show: 

1. many possible projection directions can be used to define a projection method 
that preserves the order of accuracy of the underlying method; and 

2. the choice of discrete gradient for discrete gradient methods that are equivalent to 
projection methods does not change the approximate solution in exact arithmetic, 
but there may be differences in finite precision arithmetic. 
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The example we use is Kepler's two-body problem in cartesian coordinates (see e.g. 
[8, §1.2] and [2]). We will consider the case where either two or three integrals are 
preserved (the fourth integral is not functionally independent). 

Kepler's two-body problem in the form of (1) is 





Xl 




X3 
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X2 




X4 


dt 


X3 




— xi/r 3 




X4 




— X2/V 3 



t > 0, 



where r = {x\ + x|) 1//2 , so that f(x) := (x3,X4, — xi/r 3 , —X2/r 3 ) T . This system models 
two bodies that attract each other, with one body at the origin and the second body 
at position (x±,X2) with velocity (or momentum if the body has mass 1) (x3,X4). The 
variable r is the distance between the two bodies. The exact solution to Kepler's 
two-body problem preserves four first integrals, 
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These integrals are energy, angular momentum, and the two components of the Runge- 
Lenz-Pauli vector respectively. As in [8, p. 12], we use the initial condition 



x = (l-e,0,0,y±±§) 



(36) 



for some e € [0, 1) so that the exact solution has period 2ir. The exact solution can be 
found by integrating equation (2.10) in [8, p. 11] but we will use a very accurate solution 
computed with Matlab's ODE45 and very small tolerances as a reference solution in 
our examples. 

In Figure 1 we have computed the solution to the Kepler two-body problem with 
initial condition (36) using e = 0.6 for several projection methods that differ according 
to which underlying method is used to define / and which projection direction is used 
to define i. We have used the classical explicit 4 th order and 6 th order Runge-Kutta 
methods (RK4, see e.g. [8, p. 30], and RK6, see [1, p. 194], respectively), with 
coefficients defined by Butcher tableaux: 
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Phase space 
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Figure 1: Phase space and order plots for the RK4 and RK6 methods and several 
projection methods that preserve three integrals. Methods a - d and a6 - d6 use 
different projection directions. 



Define / using these coefficients by / := h^2t=i biki where bi and fej are defined as in 
[8, eq. (1.4) on p. 29]. Note that we do not require the coefficients indicated by * in 
the tableaux because we are solving an autonomous ODE. 

We then define projection methods a, b, c, d, a6, 66, c6 and d6 for preserving three 
integrals (M = 3) by (25) by choosing RK4 to define / for methods a — d and RK6 to 
define / for methods a6 — d6. The projection directions for these methods are defined 
by 



methods a and a6 
methods b and 66 
methods c and c6 
methods d and d6 



i m (x'), 



i m (x), 



i m (y) where y = x + hf(x, h), 



i(z m (*)+z"V)). 



We modify (25) in our computations to prevent the value of I m drifting due to finite 
precision arithmetic. Instead of requiring I m (x') = I m (x) at each time step, we require 
that I m (x') = I m (xo) for each m. 

In the phase space plot of Figure 1 (left) we see that method 6 does indeed keep the 
approximate solution on the ellipse while the RK4 approximate solution drifts away 
from the ellipse. In this plot we computed the solution up to a final time of t = 50-7T 
(25 periods) with h = |g . 

In the order plot of Figure 1 (right) we see the more important result that methods 
a — d all seem to preserve the 4 th order convergence of the RK4 method and methods 
a6 - d6 all seem to preserve the 6 th order convergence of the RK6 method. In fact, 
we see that the different choices for the projection direction seem to make very little 
difference to the error because the errors are essentially the same size for methods a - 
d and methods a6 - d6 respectively (the lines in the plot overlay each other). For this 
plot we computed up to a final time of t = 2ir (only 1 period) for a range of step sizes 
in [10 -3 - 5 ,10 _1 ]. 
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Figure 2: Left: A plot of the difference between method 6 to methods 61 and 62 for 
increasing time. Right: A plot of the integral error of I\ and I2 for methods 61 and 62 
as time increases. 

In Figure 2 we have computed the solution to the Kepler two-body problem using 
three methods that are equivalent in exact arithmetic according to our theory - a 
projection method, and two discrete gradient methods defined using different choices 
of discrete gradient. To avoid making the discrete gradient methods overly complicated 
we only consider the case when two integrals, I\ and I2, are preserved. The projection 
method we use is method 6 as defined above except now we only preserve two integrals. 
We define methods 61 and 62 by (32) and (33) with M = 2, the same choices of / and 
i m as for method 6, and if j m denotes the coordinate increment discrete gradient of 
7 m (see [11] or [12]) then i m in methods 61 and 62 is defined as 

method 61: i m = i m (x, x') = j m (x, x'), 

method 62: i m = i m (x,x') = \Q m {x, x') + j m (x', x)). 

In exact arithmetic, according to Theorems 13 and 15 these methods should be the 
same. However, in finite precision arithmetic we notice some small differences. 

In the left plot of Figure 2 we have compared methods 61 and 62 with method 6 
for increasing time. Since computations are done in finite precision arithmetic and the 
nonlinear systems at each time step are only solved to a tolerance of 10 -14 we expect 
to see that the differences between these methods grows linearly with time. Perhaps 
surprisingly we actually see quadratic growth in the difference between these methods. 
In the right plot of Figure 2 we have plotted the error of I\ and I2 for the approximate 
solution as time increases for methods 61 and 62 (method 6 is constructed to keep the 
integral error below 1CP 14 , the tolerance that we solve the nonlinear systems at each 
time step). We see linear growth of the integral errors as time increases, as expected. 
The plots in Figure 2 used the same initial condition as above, a time step of h = |^ 
and a final time of t = lOChr (50 periods). 
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7 Conclusions 



In this paper we determined the relationship between linear projection methods and 
discrete gradient methods for ODEs with conserved first integrals. A consequence of 
our theory is that each linear projection method is equivalent to a class of discrete 
gradient methods. A further consequence when there is only one first integral to pre- 
serve is that we can use theory from discrete gradient methods to prove results about 
projection methods. We have shown that under only mild conditions on the continuity 
and consistency of the projection direction we obtain a projection method that has a 
well-defined approximate solution provided the time step is sufficiently small, and for 
arbitrary p € N also preserves the order of accuracy of an underlying method of order 
p. Moreover, the condition on the projection direction does not depend on p. For the 
multiple first integral case we rely on numerical experiments to confirm that similar 
results appear to also hold in this case. 
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A Example 2 continued from §4.2 

Continuing on from the end of §4.2 we seek to show that g satisfying Assumption 1 
is sufficient to ensure that both i and / defined by (19) satisfy Assumptions 1 and 2 
respectively. We begin by verifying that i defined by (19) satisfies Assumption 2 for 
some choice of R, L and H. 
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Lemma 16. For a bounded set B C M. d , suppose that i satisfies Assumption 3 for 
some positive constants R and Li, and let H be an arbitrary positive constant. Define 
L := \Li. Then i : R d x R d x [0, oo) -»• R d defined by i(x,x', h) := \{i{x) + i{x')) for 
all x, x' G R d and h G [0, oo) satisfies Assumption 2 for R, L and H. 

Proof. Since i is locally Lipschitz continuous, Lj exists for any given R > 0. Let x G B, 
u,v,w G Br(x) and h G [0, oo). Trivially, we get i(x,x,0) = i(x) and \i(x,x,h) — 
i(x, x, 0)| = 0. Using Assumption 3 for i we also get 

\i(u, v, h) — i(w, v,h)\ = \\i{u) — i(w)\ < \Li\u — w\ = L\u — w\. 

Similarly, we get \i(u,v,h) — i(u,w,h)\ < L\v — w\. □ 

Verifying that / denned by (19) satisfies Assumption 1 requires a much lengthier 
argument. We first prove the following lemma to describe the properties of ^. 

Lemma 17. For a bounded set B C R d , let C\ be the constant from (4), and suppose 
that 

1. g : R d x R d x [0, oo) — > R d satisfies Assumption 1 for positive constants R g , L g 
and H g , 

2. i : R d x R d x [0, oo) — >• R d satisfies Assumption 2 for positive constants Rj, L- { 
and H^, 

3. i is a discrete gradient of I satisfying Assumption 2 for Rj and Iq, and 
4- i satisfies Assumption 3 for R g and a positive constant Li. 

Define 

R x :=max{2R g ,R~ i ,12L~ i ,L i ,4Lg}, and H x := min |g g , H h ^ 7(Ci + 2)Ra } ■ 

For any x G B such that i(x) ^ 0, any u,v G Br x (x) and any h G [0, H\), there exists 
a unique A = ^>(u,v,h) G M. such that |A| < satisfying 

^ g (u + ^i(u),v - ±i(v), h) ■ i(u, v) 
i(u, v, h) • i(u, v) 

Moreover, 

\V(u,v,h)\ < |(Ci + 2)/i, (37) 

and if w G Br x (x) then 

\^{u,v,h)-^{w,v,h)\<l ¥ ^ Tl \u-w\, 

_ -, (38) 
\V(u,v,h)-*(u,w,h)\<%^\v-w\. 

If x G B such that i(x) = 0, then define A = ^f(u,v,h) := for all u, v G Br x (x) and 
h€[0,H x ). 
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Proof. Note that since i is locally Lipschitz continuous, Li exists for any R g > 0. Fix 
x G B such that i(i)/0, m,d£ B Rx (x) and h G [0, #a)- To get the result we will use 
Theorem 7. Define I := {7 e R : | — y | < -^-} (which with the Euclidean norm | • | is a 
non-empty complete Metric space) and T : X — s> R by 

T( 7 ) := _^(" + W^-^U)-i(u,t;) for ^ 7 g x 
i(u, v, h) ■ i(u, v) 

To apply Theorem 7 we must show that T(j) G X for each 7 G X and that T is a 
contraction on X. 

Using Assumption 3 for i and R\ > Li we get the following useful inequality, 

\i{u)\ < \i(x)\ + Li\u - x\ < (l + \i(x)\ < 2\i(x)\. (39) 

Fix 7^1 and define y := u + ^i(u) and z := v — ^i{v). Using (39), | — y | < and 
u G Br x (x) we get 

\y - X \ < |„ - X | + JJ| i(u )| < | U - X | + | 7 || i(x )| < + ^) \i( X )\ = 

Hence y G B Rx / 2 (x), and since -Ra/2 > -R 9 we have y G B Rx / 2 C B Rg {x). Similarly, 
2; G B Rx / 2 C B Rg (x). Using this, Assumption 1 for g, (4), i?A > 4L 9 and /i < 2^77- < 77- 
we get 

\g{y,z,h)\ < \g(x,x,0)\ + L fl (|y - x\ + \z - x\ + h\i(x)\) 

<(Ci + 2 -t + 2 -t + L g h) \i(x)\ < (d + 2)K(x)|. (40) 

Using Assumption 2 for i, and i?A > I2L7 we get 

\i(u,v)\ < \i(x,x)\ + L.(\u-x\ + \v-x\) <(l + ^fj \i(x)\ < l\i(x)\. (41) 

Using Assumpiton 2 for i and i, (41), i?A > 12L^ and h < gjz we get 

i(u, v, h) ■ i(u, v) = \i(x)\ 2 + (\i(u, v, h) — i(x, v, h)] + [i(x, v, h) — i(x, x, h)] 
+ [i(x, x, h) — i(x, x, 0)]j ■ i(u, v) 
+ i(x) ■ ( [i(u, v) — i(x, v)] + [i(x, v) — i(x, x)] 



> \i(x)\ 2 - L-(\u -x\ + \v-x\ + h\i(x)\)l\i(x)\ ^ 

— \i(x)\L- i (\u — x\ + \v — x\) 

>l^)| 2 i}-Hw x +h)l-L^ 

> \t{x)\ 2 (1 - (| + \)l -\) = §-^(x)\ 2 > \\i{x)\ 2 . 

We can now show that |T(7)| < Using (40), (41), (42) and h < 7(Ci + 2)jRa we § et 
\^)\ = h ^^ <UC 1 + 2)h<^. (43) 



Hence T(j) G X and so T : X — >■ X. It remains to show that T is a contraction. 

|z(ii) and z' := v — |i 



Let 5 G X and define y' := u + ii{u) and z' := v — ii(v ). As above we have y' , z' G 
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B Rx / 2 (x) C Bn g (x). Using (42), (41), Assumption 1 for g, (39) (which also holds for 
|«(v)|) and h < ^j- we get 



[g(y,z,h)-g(y' ,z' ,h)]-i(u,v) 
i(u,v,h)-i(u,v) 
7h \~/_. .. l> J 



\T{ 1 )-T{5)\ = h 

- Wx)\ ls(y> h ) - 9(y', z ', h )\ 

7hL a 



<m\(\y-y'\ + \^-A 



< 



IhLg 
- e|*(as)| 
7hL 



| 7 -*|(|i(u)| + !<(«)!) 



< 7 -^h-s\<lh-s\, 

so T is a contraction. Therefore, applying Theorem 7, there exists a unique A € X 
such that T(A) = A. Define *$>(u,v,h) := A. Inequality (37) then follows from (43). 
Now let w € Br x (x) and define A := ^(u, f,/i), 7 := ^>(w,v,h) and 



y := u+ 



2' 



y' := w+ %i(w), 
z' := v — -%i(v). 



z := v — T%i(v), 

Using Assumption 3 for i, |A| < (39) and i?A > £j we get 



|j/ - y'\ <\u-w\ + l$\i(u) - i(w)\ + M|A - 7 | 

< (i + i) |«-H + l*0»0l|A-7l 

< ||u - to I + |i(a;)||A - 7 |. 

Likewise, \z — z'\ < ^^-|A — 7 | < |i(x)||A — 7 |. Using these two inequalities, together 
with Assumption 1 for g, and noting that y,y',z,z' € B Rx / 2 (x) C Br 3 (x) we get 

\g(y, z, h) - g(y', z' , h)\ < L g (\y - y'\ + \z - z'\) < L g (§kx -w\ + 2\i(x)\\\ - 7 |) . (44) 

Using Assumption 2 for i, u,v G Br x (x), R\ > 12Lj and /i < <^ we get 



|i(«, u, /i)| < |i(x)| + L^(|u — x| + \v — x\ + h\i(x)\) 



^) l»(x)| 



(45) 



< (1 + 1 + l)\i(x)\ = §\i(x)\ < l\i(x)\. 



The same inequality holds for \i(w,v,h)\. Using (41) and (42) (which also hold with u 
replaced by w), (45), and Assumption 2 for i and i we get 



1 1 _ 

i(u,v,h)-i(u,v) i(w,v,h)-i(w,v) 



i(w,v,h)-i(w,v)—i(u,v,h)-i(u,v) 

(i(u,v,h)-i(u,v))(i(w,v,h)-i(w,v)) 

[i(w ,v ,h) —i(u,v,h)]-i(w ,v)+i(u,v,h)-[i(w ,v) —i(u,v)] 
(i(u,v,h)'i(u,v))(i(w,v,h)'i(w,v)) 

- (\ i ( u ^ v ^ h ) - ^i^)! 11*0*01 + |l*0*0IKK u ) 

< 1 -Aim (IL~-\u — w\ + I-Z/-|m — itfl) = rrrfel'" 

— |i(x)| J V6 ii 1 3 ii 1/ p( x )l 1 



»(«>,«) I) 



(46) 
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Now using (42), (41), (40), (44), Assumption 2 for i, (46), h < minl^^-, 7(d+2)R x } 
and R\ > 12L^ we get 



|A-7l =h 
=h 



g(y,z,h)-i(u,v) _ g(y' ,z' ,h)-i(w,v ) 
i(u,v,h)-i(u,v) i(w,v,h)-i(w,v) 

[§(y,z,h)-g(y' \z' ' ,h]-i{u,v) _|_ g(y' ,z' ,h)-[i(u,y)-i(w,v)] 
i(u,v,h)-i(u,v) i(u,v,h)-i(u,v) 

+ g(y',z',hyi( w , v) - J —± 1 — S j 



— "I 6 K( a: )ll9fa' z ' fc )~gfa'' z '' fe )l+ |i(V)p (Ci+2)|i(a:)||t(M,u)-i(^,i;)| 



</i 



+ (Ci+2)|i(x)|||t(a 



i(u,v,h)-i(u,v) i(w,v,h)-i(w,v) 
' 1 2 ( Ci H- 2 ) 



(47) 



+ 



7(Ci+2)|»Qr)| 2 



1 



<h 



i(u,v,h)-i(u,v) i(w,v,h)-i(w,v) 

TL„ /3, , , 1 1 . ,\ 2(Ci+2) r , , 7(Ci+2)|j(x)| 2 10L; 

3[^(f|u-t |+2|,(»)||A-7|) + -^^£ ? |t,-t |+ 6 ;K |i(x) | 



$\u-w\ 



(TLgh . 41Lr(Ci+2)h\ i | | 14L g , u , 

<(l + 3)^K-H + ^|A-7l 



It then follows that \^(u, v, h) — ty(w, v,h)\ < | |tt — The second inequality in 
(38) is derived using a similar argument. □ 

Using Lemma 17 we can easily derive the following lemma to ensure that / for the 
symmetric projection method (defined by (19)) satisfies Assumption 1. 

Lemma 18. For a bounded set B C M. d , define C\, g, R g , L g , H g , i, Lq, Hj, i, Li, 
R\ and H\ as in Lemma 1 7. In addition, define 

L:=max{4L g , ( 7Cl+ 3 17 ^ }. 

Then f : M. d x R d x [0, oo) -> R d , defined as in (19), 

f~(x, x>, h) :=g(x+ i(x)j x , _ 5^7^^ ^ , 

for all x, x' £ M d and /i € [0, oo) satisfies Assumption 1 for R\, L and H\. 

Proof. Since the assumptions of Lemma 18 are the same as Lemma 17 all of the results 
in Lemma 17 hold here. Fix x G B, u,v,w € Br x {x) and h € [0, H\). Since *(x, x, 0) = 
(by Lemma 17) it follows from Assumption 1 for g that f(x, x, 0) = g(x, x, 0) = f(x). 
Using Lemma 17 we can define A := ^(u, v, h) and 7 := ^f(w, v, h), and 



y := u + £i(it), 
z := v — ^i{v), 



y' :=w+ %i(w), 
z' := v — l}i(v). 



30 



From (44) in the proof of Lemma 17, and (38) we get 

\f(u,v,h) - f(w,v,h)\ = \g(y,z,h) -g(y',z',h)\ 

<L fl (||«-H + 2|i(x)||A-7|) 

< L g (| + |) \u — w\ = 4:L g \u — w\ < L\u — w\. 

Similarly, we can show that \ f(u,v,h) — f(u,w,h)\ < L\v — w\. 

Now define A := ^(x, x, h), y := x + |i(x) and z := x — | i(x). Using ^(x, x, 0) = 0, 
Assumption 1 for g and (37) we get 

\f(x,x,h) - f(x,x,0)\ = \g(y,z,h) -g(x,x,0)\ 

< L g (\y -x\ + \z-x\ + h\i(x)\) = L g (\X\ + h)\i(x)\ 

< L g (Kd + 2) + 1) h\i(x)\ = {7Cl+ 3 17)L3 h\i(x)\ < Lh\i(x)\. 
Hence / satisfies Assumption 1 for R\, L and H\. □ 
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